Using a Priori Information for Fast Learning Against Non-stationary Opponents

Authors

Pablo Hernandez-Leal

Enrique Munoz de Cote

Luis Enrique Sucar

Abstract

To succeed against many different and unknown types of opponents, an agent should quickly learn a model of the opponent and adapt online to non-stationary (changing) strategies. Recent work has tackled this problem by continuously learning models of the opponent while checking for switches in the opponent's strategy. However, these approaches do not use a priori information, which can speed up detection of the opponent model. Moreover, if an opponent uses only a finite set of strategies, then maintaining a list of those strategies would also benefit future interactions with opponents who return to previous strategies (such as periodic opponents). Our contribution is twofold. First, we propose an algorithm that can use a priori information, in the form of a set of models, to promote faster detection of the opponent model. Second, we propose an algorithm that, while learning new models, keeps a record of them in case the opponent reuses one. Our approach outperforms state-of-the-art algorithms in the field (in terms of model quality and cumulative rewards) in the iterated prisoner’s dilemma against a non-stationary opponent that switches among different strategies.
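
The idea of using a prior set of models to speed up detection can be illustrated with a minimal sketch. This is not the algorithm from the paper: the three candidate strategies, the agreement score, and the `threshold` parameter are all assumptions chosen for illustration. The sketch scores each prior model by how often it predicts the opponent's observed actions in the iterated prisoner's dilemma and reports the best match if it agrees often enough.

```python
# Hypothetical illustration only; not the method proposed in the paper.
C, D = "C", "D"  # cooperate, defect

# Each model predicts the opponent's next action from the previous round:
# (agent's previous action, opponent's previous action). None means first round.
def tit_for_tat(prev_agent, prev_opp):
    return C if prev_agent is None else prev_agent

def pavlov(prev_agent, prev_opp):
    # Win-stay, lose-shift: cooperate if both players chose the same action.
    if prev_agent is None:
        return C
    return C if prev_agent == prev_opp else D

def always_defect(prev_agent, prev_opp):
    return D

# Assumed prior knowledge: a finite set of candidate opponent models.
PRIOR_MODELS = {
    "tit_for_tat": tit_for_tat,
    "pavlov": pavlov,
    "always_defect": always_defect,
}

def match_scores(history, models):
    """Fraction of rounds in which each model predicts the observed opponent action.

    history is a list of (agent_action, opponent_action) pairs, oldest first.
    """
    scores = {}
    for name, model in models.items():
        hits = 0
        prev_agent, prev_opp = None, None
        for agent_action, opp_action in history:
            if model(prev_agent, prev_opp) == opp_action:
                hits += 1
            prev_agent, prev_opp = agent_action, opp_action
        scores[name] = hits / len(history) if history else 0.0
    return scores

def detect_model(history, models, threshold=0.9):
    """Return the best-matching model name, or None if no model fits well enough."""
    scores = match_scores(history, models)
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

# Five rounds against an opponent that behaves like tit-for-tat.
history = [(C, C), (D, C), (D, D), (C, D), (D, C)]
print(detect_model(history, PRIOR_MODELS))
```

When no prior model clears the threshold, `detect_model` returns `None`; in a fuller design that case is where an agent would presumably fall back to learning a new model from scratch and adding it to the set.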


Similar resources

Learning Against Non-Stationary Opponents in Double Auctions

Energy markets are emerging around the world. In this context, the PowerTAC competition has gained attention for being a realistic and powerful simulation platform that can be used to perform robust research on retail energy markets. Agents in this complex environment typically use different strategies throughout their interaction, changing from one to another depending on diverse factors, for e...


Unifying Convergence and No-Regret in Multiagent Learning

We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR. ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilibrium policies in self-play. But it makes two strong assumptions: (1) that it can distinguish between self-...


Learning in games with more than two players

We address the problem of learning in repeated N-player (as opposed to 2-player) general-sum games. We describe an extension to existing criteria focusing explicitly on such settings. While there have been several criteria proposed recently for evaluating learning algorithms in multi-agent systems, most of this work has focused on the two-player setting. Relatively little work has addressed sit...


Opponent Modeling against Non-stationary Strategies: (Doctoral Consortium)

Most state-of-the-art learning algorithms do not fare well with agents (computer or human) that change their behaviour over time. This is because they usually do not model the other agents’ behaviour and instead make assumptions that are too restrictive for real scenarios. Furthermore, considering that many applications demand different types of agents to work together this should ...


Performance Bounded Reinforcement Learning in Strategic Interactions

Despite increasing deployment of agent technologies in several business and industry domains, user confidence in fully automated agent-driven applications is noticeably lacking. The main reasons for this lack of trust in complete automation are scalability and the absence of reasonable guarantees on the performance of self-adapting software. In this paper we address the latter issue in the con...



Publication date: 2014
